Fast Statistical Grammar Induction

Authors

  • Wide R. Hogenhout
  • Yuji Matsumoto
Abstract

The statistical induction of context-free grammars from bracketed corpora with the Inside-Outside algorithm has often inspired researchers, but its computational complexity has made it impossible to generate a large-scale grammar. The method we suggest achieves the same results as earlier research at a much lower cost in computing time. We explain the modifications needed to the algorithm, report experimental results, and compare them with results reported in the literature.

Similar resources

Induction of Greedy Controllers for Deterministic Treebank Parsers

Most statistical parsers have used the grammar induction approach, in which a stochastic grammar is induced from a treebank. An alternative approach is to induce a controller for a given parsing automaton. Such controllers may be stochastic; here, we focus on greedy controllers, which result in deterministic parsers. We use decision trees to learn the controllers. The resulting parsers are surp...

Grammar Induction by Distributional Clustering with the Fragment Constituency Criterion

This paper proposes that the identification of constituents, which is the core problem in grammar induction, can be accomplished by a simple constituency criterion in linguistics: a word/tag sequence which can occur as a fragment is a constituent. Experiment results show that grammar induction by distributional clustering augmented with this criterion achieves good PARSEVAL scores and improves ...

A Bayesian Model for Supervised Grammar Induction of Natural Language

In this paper, we show that the problem of grammar induction could be modeled as a combination of several model selection problems. We use the infinite generalization of a Bayesian model of cognition to solve each model selection problem in our grammar induction model. This Bayesian model is capable of solving model selection problems, consistent with human cognition. We also show that using th...

Introduction to the Special Topic on Grammar Induction, Representation of Language and Language Learning

Grammar induction refers to the process of learning grammars and languages from data; this finds a variety of applications in syntactic pattern recognition, the modeling of natural language acquisition, data mining and machine translation. This special topic contains several papers presenting some recent developments in the area of grammar induction and language learning, as applied to vario...

Synchronous Constituent Context Model for Inducing Bilingual Synchronous Structures

Traditional Statistical Machine Translation (SMT) systems heuristically extract synchronous structures from word alignments, while synchronous grammar induction provides better solutions that can discard heuristic methods and directly obtain statistically sound bilingual synchronous structures. This paper proposes the Synchronous Constituent Context Model (SCCM) for synchronous grammar induction. Th...

Publication date: 1996